AITopics

2509.09183

Country: Europe > Switzerland (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Chen, Dangxing, Ye, Weicheng

Generalized Gloves of Neural Additive Models: Pursuing transparent and accurate machine learning models in finance

arXiv.org Artificial IntelligenceSep-20-2022

For many years, machine learning methods have been used in a wide range of fields, including computer vision and natural language processing. While machine learning methods have significantly improved model performance over traditional methods, their black-box structure makes it difficult for researchers to interpret results. For highly regulated financial industries, transparency, explainability, and fairness are equally, if not more, important than accuracy. Without meeting regulated requirements, even highly accurate machine learning methods are unlikely to be accepted. We address this issue by introducing a novel class of transparent and interpretable machine learning algorithms known as generalized gloves of neural additive models. The generalized gloves of neural additive models separate features into three categories: linear features, individual nonlinear features, and interacted nonlinear features. Additionally, interactions in the last category are only local. The linear and nonlinear components are distinguished by a stepwise selection algorithm, and interacted groups are carefully verified by applying additive separation criteria. Empirical results demonstrate that generalized gloves of neural additive models provide optimal accuracy with the simplest architecture, allowing for a highly accurate, transparent, and explainable approach to machine learning.

artificial intelligence, deep learning, machine learning, (19 more...)

2209.10082

Country:

Asia > Taiwan (0.04)
North America > United States (0.04)
Asia > China > Jiangsu Province (0.04)
Asia > Bangladesh (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Government (1.00)
Banking & Finance > Credit (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningJan-31-2022

JULIA: Joint Multi-linear and Nonlinear Identification for Tensor Completion

Qian, Cheng, Huang, Kejun, Glass, Lucas, Srinivasa, Rakshith S., Sun, Jimeng

Tensor completion aims at imputing missing entries from a partially observed tensor. Existing tensor completion methods often assume either multi-linear or nonlinear relationships between latent components. However, real-world tensors have much more complex patterns where both multi-linear and nonlinear relationships may coexist. In such cases, the existing methods are insufficient to describe the data structure. This paper proposes a Joint mUlti-linear and nonLinear IdentificAtion (JULIA) framework for large-scale tensor completion. JULIA unifies the multi-linear and nonlinear tensor completion models with several advantages over the existing methods: 1) Flexible model selection, i.e., it fits a tensor by assigning its values as a combination of multi-linear and nonlinear components; 2) Compatible with existing nonlinear tensor completion methods; 3) Efficient training based on a well-designed alternating optimization approach. Experiments on six real large-scale tensors demonstrate that JULIA outperforms many existing tensor completion algorithms. Furthermore, JULIA can improve the performance of a class of nonlinear tensor completion methods. The results show that in some large-scale tensor completion scenarios, baseline methods with JULIA are able to obtain up to 55% lower root mean-squared-error and save 67% computational complexity.

completion, dataset, tensor, (16 more...)

2202.00071

Country:

North America > United States > California (0.14)
North America > United States > New York (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > United States > Florida (0.04)

Genre: Research Report (0.84)

Industry:

Government (0.68)
Health & Medicine (0.49)
Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Rogers, Timothy J., Friis, Tobias

A Latent Restoring Force Approach to Nonlinear System Identification

arXiv.org Machine LearningSep-22-2021

Identification of nonlinear dynamic systems remains a significant challenge across engineering. This work suggests an approach based on Bayesian filtering to extract and identify the contribution of an unknown nonlinear term in the system which can be seen as an alternative viewpoint on restoring force surface type approaches. To achieve this identification, the contribution which is the nonlinear restoring force is modelled, initially, as a Gaussian process in time. That Gaussian process is converted into a state-space model and combined with the linear dynamic component of the system. Then, by inference of the filtering and smoothing distributions, the internal states of the system and the nonlinear restoring force can be extracted. In possession of these states a nonlinear model can be constructed. The approach is demonstrated to be effective in both a simulated case study and on an experimental benchmark dataset.

identification, nonlinear, nonlinear model, (13 more...)

2109.10681

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Guo, Mengzhuo, Zhang, Qingpeng, Liao, Xiuwu, Zeng, Daniel Dajun

An interpretable neural network model through piecewise linear approximation

arXiv.org Artificial IntelligenceJan-20-2020

Most existing interpretable methods explain a black-box model in a post-hoc manner, which uses simpler models or data analysis techniques to interpret the predictions after the model is learned. However, they (a) may derive contradictory explanations on the same predictions given different methods and data samples, and (b) focus on using simpler models to provide higher descriptive accuracy at the sacrifice of prediction accuracy. To address these issues, we propose a hybrid interpretable model that combines a piecewise linear component and a nonlinear component. The first component describes the explicit feature contributions by piecewise linear approximation to increase the expressiveness of the model. The other component uses a multi-layer perceptron to capture feature interactions and implicit nonlinearity, and increase the prediction performance. Different from the post-hoc approaches, the interpretability is obtained once the model is learned in the form of feature shapes. We also provide a variant to explore higher-order interactions among features to demonstrate that the proposed model is flexible for adaptation. Experiments demonstrate that the proposed model can achieve good interpretability by describing feature shapes while maintaining state-of-the-art accuracy.

interaction, neural network, prediction, (14 more...)

2001.07119

Country:

North America > United States > Arizona (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Tay, J. Kenneth, Tibshirani, Robert

Reluctant additive modeling

arXiv.org Machine LearningDec-4-2019

Sparse generalized additive models (GAMs) are an extension of sparse generalized linear models which allow a model's prediction to vary non-linearly with an input variable. This enables the data analyst build more accurate models, especially when the linearity assumption is known to be a poor approximation of reality. Motivated by reluctant interaction modeling (Yu et al. 2019), we propose a multi-stage algorithm, called $\textit{reluctant additive modeling (RAM)}$, that can fit sparse generalized additive models at scale. It is guided by the principle that, if all else is equal, one should prefer a linear feature over a non-linear feature. Unlike existing methods for sparse GAMs, RAM can be extended easily to binary, count and survival data. We demonstrate the method's effectiveness on real and simulated examples.

modeling, nonlinear component, nonlinear feature, (16 more...)

1912.01808

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.69)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Modeling & Simulation (0.94)

arXiv.org Machine LearningJun-4-2019

An interpretable machine learning framework for modelling human decision behavior

Guo, Mengzhuo, Zhang, Qingpeng, Liao, Xiuwu, Chen, Youhua

Machine learning has recently been widely adopted to address the managerial decision making problems. However, there is a trade-off between performance and interpretability. Full complexity models (such as neural network-based models) are non-traceable black-box, whereas classic interpretable models (such as logistic regression) are usually simplified with lower accuracy. This trade-off limits the application of state-of-the-art machine learning models in management problems, which requires high prediction performance, as well as the understanding of individual attributes' contributions to the model outcome. Multiple criteria decision aiding (MCDA) is a family of interpretable approaches to depicting the rationale of human decision behavior. It is also limited by strong assumptions (e.g. preference independence). In this paper, we propose an interpretable machine learning approach, namely Neural Network-based Multiple Criteria Decision Aiding (NN-MCDA), which combines an additive MCDA model and a fully-connected multilayer perceptron (MLP) to achieve good performance while preserving a certain degree of interpretability. NN-MCDA has a linear component (in an additive form of a set of polynomial functions) to capture the detailed relationship between individual attributes and the prediction, and a nonlinear component (in a standard MLP form) to capture the high-order interactions between attributes and their complex nonlinear transformations. We demonstrate the effectiveness of NN-MCDA with extensive simulation studies and two real-world datasets. To the best of our knowledge, this research is the first to enhance the interpretability of machine learning models with MCDA techniques. The proposed framework also sheds light on how to use machine learning techniques to free MCDA from strong assumptions.

artificial intelligence, machine learning, marginal value function, (17 more...)

1906.01233

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Epidemiology (0.68)
Health & Medicine > Consumer Health (0.67)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Habibnia, Ali, Maasoumi, Esfandiar

Forecasting in Big Data Environments: an Adaptable and Automated Shrinkage Estimation of Neural Networks (AAShNet)

arXiv.org Machine LearningApr-24-2019

This paper considers improved forecasting in possibly nonlinear dynamic settings, with high-dimension predictors ("big data" environments). To overcome the curse of dimensionality and manage data and model complexity, we examine shrinkage estimation of a back-propagation algorithm of a deep neural net with skip-layer connections. We expressly include both linear and nonlinear components. This is a high-dimensional learning approach including both sparsity L1 and smoothness L2 penalties, allowing high-dimensionality and nonlinearity to be accommodated in one step. This approach selects significant predictors as well as the topology of the neural network. We estimate optimal values of shrinkage hyperparameters by incorporating a gradient-based optimization technique resulting in robust predictions with improved reproducibility. The latter has been an issue in some approaches. This is statistically interpretable and unravels some network structure, commonly left to a black box. An additional advantage is that the nonlinear part tends to get pruned if the underlying process is linear. In an application to forecasting equity returns, the proposed approach captures nonlinear dynamics between equities to enhance forecast performance. It offers an appreciable improvement over current univariate and multivariate models by RMSE and actual portfolio performance.

artificial intelligence, deep learning, machine learning, (16 more...)

1904.11145

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Büyükşahin, Ümit Çavuş, Ertekin, Şeyda

Improving forecasting accuracy of time series data using a new ARIMA-ANN hybrid method and empirical mode decomposition

arXiv.org Machine LearningDec-30-2018

Many applications in different domains produce large amount of time series data. Making accurate forecasting is critical for many decision makers. Various time series forecasting methods exist which use linear and nonlinear models separately or combination of both. Studies show that combining of linear and nonlinear models can be effective to improve forecasting performance. However, some assumptions that those existing methods make, might restrict their performance in certain situations. We provide a new Autoregressive Integrated Moving Average (ARIMA)-Artificial Neural Network(ANN) hybrid method that work in a more general framework. Experimental results show that strategies for decomposing the original data and for combining linear and nonlinear models throughout the hybridization process are key factors in the forecasting performance of the methods. By using appropriate strategies, our hybrid method can be an effective way to improve forecasting accuracy obtained by traditional hybrid methods and also either of the individual methods used separately.

dataset, hybrid method, time series data, (14 more...)

1812.11526

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.68)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (1.00)
Banking & Finance (0.69)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceApr-3-2012

Validation of nonlinear PCA

Scholz, Matthias

Linear principal component analysis (PCA) can be extended to a nonlinear PCA by using artificial neural networks. But the benefit of curved components requires a careful control of the model complexity. Moreover, standard techniques for model selection, including cross-validation and more generally the use of an independent test set, fail when applied to nonlinear PCA because of its inherent unsupervised characteristics. This paper presents a new approach for validating the complexity of nonlinear PCA models by using the error in missing data estimation as a criterion for model selection. It is motivated by the idea that only the model of optimal complexity is able to predict missing values with the highest accuracy. While standard test set validation usually favours over-fitted nonlinear PCA models, the proposed model validation approach correctly selects the optimal model complexity.

complexity, nonlinear pca, validation, (12 more...)

doi: 10.1007/s11063-012-9220-6

1204.0684

Country:

North America > United States > New York (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
Europe > Italy (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)